Communications Medicine
○ Springer Science and Business Media LLC
Preprints posted in the last 90 days, ranked by how well they match Communications Medicine's content profile, based on 85 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.
Lakhani, S.
This study analyzes 794,811 digitized medical examinations from Indian life-insurance applicants, a working-age, urban-skewed demographic often undersampled by national surveys. The cohort exhibits a pronounced South-Asian cardiometabolic risk profile: among valid adult records, 41.9% met the criteria for dyslipidemia (driven heavily by low HDL and elevated triglycerides), and 61.4% met AHA 2017 criteria for stage 1 hypertension. However, canonicalizing this dataset across 33,244 diagnostic centers revealed significant heterogeneity in laboratory reference ranges. At the clinical prediabetes threshold of 110 mg/dL for fasting blood sugar, the record-pair disagreement rate across laboratories was 49.7%, with similar variance across other common tests. This structural inconsistency materially affects patient classification and the tracking of disease prevalence, underscoring a critical need for the national standardization of laboratory reporting in India.
Yuan, S.; McVey, J. C.; Hartmann, K.; Abramowitz, S.; Woerner, J.; Shakt, G.; Judy, R.; Douglas, J. E.; Voight, B. F.; Kohanski, M. A.; Cohen, N. A.; Levin, M.; Damrauer, S. M.
Background: Chronic rhinosinusitis (CRS) and nasal polyps (NP) are closely related inflammatory airway diseases, and their co-occurrence is often associated with more persistent symptoms, frequent recurrence, and substantial respiratory morbidity. However, the extent to which CRS without and with NP (CRSsNP and CRSwNP) share genetic susceptibility, and which genetic mechanisms are disease-specific, remains poorly characterized. Methods: We conducted cross-population genome-wide association meta-analyses of overall CRS (including both CRSwNP and CRSsNP) and NP (a proxy for CRSwNP) using data from six biobanks. We estimated genome-wide genetic correlations between overall CRS, CRSwNP, and a spectrum of respiratory diseases. We applied five complementary gene-prioritization strategies to nominate CRS- and CRSwNP-associated genes and performed pathway enrichment analyses to infer implicated biological processes. For CRSwNP, we integrated single-cell transcriptomic data to characterize cell-type-specific expression of prioritized genes and used stratified LD score regression to quantify heritability enrichment across immune and epithelial annotations. To delineate shared versus disease-specific genetic signals, we performed three comparative analyses: local genetic correlation, CRSwNP-CRS colocalization, and genomic structural equation modeling. Finally, we performed proteome-wide Mendelian randomization to identify circulating proteins with putative causal effects on CRS and CRSwNP. Results: This GWAS meta-analysis identified 96 genome-wide significant loci for CRSwNP and 41 for overall CRS, prioritizing 92 and 39 candidate genes, respectively. CRSwNP and overall CRS showed shared genetic susceptibility (rg = 0.59; P = 6.8e-16), while CRS exhibited broader genetic correlations across multiple respiratory disorders. Pathway analyses consistently implicated immune signaling (albeit with disease-specific emphases) and lipid-metabolism networks.
Single-cell analyses demonstrated distinct expression of CRSwNP-prioritized genes across nasal epithelial and immune cell clusters, and immune annotations explained more CRSwNP heritability (enrichment score = 4.1; P = 0.010) than epithelial annotations (2.5; P = 0.072). Comparative genetic analyses highlighted multiple shared loci (including BACH2, CD247, FADS2, FOXP1, FUT2, GPX4, IL7R, NDFIP1, RAB5B, RORA, SMAD3, and TSLP) as well as 3 CRSwNP-specific and 6 CRS-specific loci. Proteome-wide MR identified 10 and 8 putatively causal circulating proteins for CRSwNP and overall CRS, respectively, with the proteins TNFSF11, IL2RB, and STX4 associated with both conditions. Conclusions: This multi-population GWAS meta-analysis expanded genetic discovery for CRS and CRSwNP and showed substantial shared liability with distinct disease-specific components. Immune components explained a larger proportion of CRSwNP heritability than epithelial annotations, reinforcing the primacy of immune-driven mechanisms in polyp disease.
Sehgal, N. K. R.; Tronieri, J. S.; Ungar, L.; Guntuku, S. C.
Social media can reveal patient experiences with glucagon-like peptide-1 receptor agonists (GLP-1 RAs) that extend beyond clinical trial data. We analyzed 410,198 Reddit posts (May 2019-June 2025) mentioning semaglutide or tirzepatide. A total of 67,008 users self-reported using these medications, and 43.5% described at least one side effect. Gastrointestinal symptoms predominated, including nausea (36.9%), fatigue (16.7%), vomiting (16.3%), constipation (15.3%), and diarrhea (12.6%). Notably, reproductive symptoms (e.g., menstrual irregularities) and temperature-related complaints (e.g., chills, hot flashes) emerged as unrecognized potential effects. These findings highlight patient concerns not well captured in current labeling or trials. Large-scale social media analysis can complement traditional pharmacovigilance by detecting emerging safety signals and expanding understanding of the real-world safety profile of GLP-1 RAs.
Khattab, A.; Wang, Z.; Srinivasasainagendra, V.; Tiwari, H. K.; Loos, R.; Limdi, N.; Irvin, M. R.
Background: Diabetic kidney disease (DKD) is a leading cause of kidney failure in individuals with type 2 diabetes (T2D), yet risk identification in routine clinical practice remains incomplete. A critical and often overlooked barrier is risk observability: how much of a patient's underlying risk is actually captured in their clinical record at the time of screening. Existing prediction models evaluate performance using model-specific thresholds, making it difficult to understand how additional data sources alter real-world screening behavior or which individuals benefit when models are expanded. Methods: We developed a series of five nested machine learning models evaluated at a one-year landmark following T2D diagnosis using data from the All of Us Research Program (N = 39,431; cases = 16,193). Each successive model added a distinct information layer (intrinsic risk, laboratory snapshots, medication exposure, longitudinal care trajectories, and social determinants of health [SDOH]) while retaining all prior features. All models were evaluated under a fixed screening policy targeting 90% specificity, so that the false positive rate remained constant as the information available to the model grew. External validation was conducted in the BioMe Biobank (N = 9,818) without retraining. Results: Discrimination improved consistently across layers, from AUROC 0.673 (M1) to 0.797 (M5). Under the fixed screening policy, sensitivity nearly doubled from 0.27 to 0.49, with a cumulative recovery of 30.4% of cases missed by the base model.
Gains were driven by distinct subgroups at each transition: laboratory features identified biologically high-risk individuals; medication features captured those with high treatment intensity reflecting advanced cardiometabolic burden; longitudinal care trajectory features rescued cases with biological instability observable only through repeated measurements; and SDOH features recovered individuals with limited clinical observability, with rescue probability highest among those with the fewest recorded monitoring domains. Sparse data in the clinical record indicated low observability, not low risk. Social and genetic features each contributed most when downstream physiologic signal was limited, supporting a contextual rather than universal role for each. In BioMe, discrimination was attenuated (M4 AUROC 0.659), but the relative ordering of information layers was fully preserved, and a systematic upward shift in predicted probability distributions underscored the need for recalibration before deployment in a new setting. Conclusions: DKD risk detection in T2D is substantially improved by integrating complementary information layers under a fixed clinical screening policy, with gains arising from distinct domains that identify at-risk individuals in different clinical contexts. The layered landmark framework introduced here reveals how risk observability, shaped by monitoring intensity, healthcare engagement, and access, determines what a screening model can detect, and provides a foundation for context-aware EHR-based screening that accounts for data availability at the time of risk assessment.
Graphical abstract. Study design and layered DKD screening framework. The top row defines the cohort timeline, in which predictors are derived from clinical data collected between T2D diagnosis and the 1-year landmark, and incident DKD is ascertained after the landmark. The second row depicts the nested model architecture, in which five successive models sequentially incorporate intrinsic risk, laboratory snapshot features, medication exposure, longitudinal care trajectories, and social determinants of health, while retaining all features from prior layers. The third row summarizes model development in the All of Us Research Program (N = 39,431) and external validation in the BioMe Biobank (N = 9,818), where the same trained models and risk thresholds were applied without retraining. The bottom row highlights the three evaluation domains: predictive performance, fixed-policy screening, and missed-case recovery context. DKD, diabetic kidney disease; T2D, type 2 diabetes; PRS, polygenic risk scores; AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve; PPV, positive predictive value; SHAP, SHapley Additive exPlanations.
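The fixed screening policy in this abstract (choose each model's threshold so that specificity is held at 90%, then compare sensitivity) can be illustrated with a minimal sketch. The function names and the toy scores below are assumptions made for illustration; they are not taken from the study, whose exact thresholding procedure is not described in the abstract.

```python
import math

def threshold_at_specificity(neg_scores, target_specificity=0.90):
    """Return a threshold t such that at least `target_specificity`
    of the negative (non-case) scores are <= t.
    A record is flagged positive only when its score is > t."""
    ordered = sorted(neg_scores)
    # Number of negatives that must fall at or below the threshold.
    k = math.ceil(target_specificity * len(ordered) - 1e-9)
    return ordered[k - 1]

def sensitivity(pos_scores, threshold):
    """Fraction of true cases flagged (score strictly above threshold)."""
    flagged = sum(1 for s in pos_scores if s > threshold)
    return flagged / len(pos_scores)

# Hypothetical risk scores from one model (illustrative values only).
negatives = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.90]
positives = [0.20, 0.50, 0.30, 0.60]

t = threshold_at_specificity(negatives, 0.90)
print(t, sensitivity(positives, t))
```

Because the threshold is recomputed per model from the non-case scores, the false positive rate stays fixed and any change in sensitivity reflects the added information rather than a looser cutoff.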
Gong, L.; Aswani, N.; Shahinian, P.; Yang, J. Y.; Kontos, D.; Manji, G.; Kang, S.; Hur, C.
Electronic health record (EHR) prediction models often summarize longitudinal histories as static patient-level features, which may omit potentially informative event ordering. We developed a simplified spike-timing-dependent plasticity (STDP)-inspired framework that represents asynchronous EHR data as sparse, directional transition features. The approach encodes whether one clinical event precedes another within prespecified temporal windows, preserving event identity, directionality, and approximate timing while retaining feature-level interpretability. We evaluated this framework in two retrospective prediction tasks with different temporal scales: incident acute kidney injury (AKI) prediction in 17,351 MIMIC-IV ICU stays and early postoperative recurrence prediction in 713 CUMC patients with pancreatic ductal adenocarcinoma (PDAC). Models using static burden features alone (demographics, comorbidities, raw lab measurements) were compared with models that additionally included STDP transition feature sets, using patient-level cross-validation and rolling prediction horizons. In AKI, a calibrated STDP ensemble model showed higher discrimination than static burden alone at the 24-hour decision snapshot for AKI by 72 hours, with AUROC 0.838 versus 0.800, and at 48 hours for near-term AKI prediction, with AUROC 0.868 versus 0.827. In PDAC, STDP transition features modestly improved Day -30 preoperative recurrence prediction, with AUROC 0.611 versus 0.587 and AUPRC 0.323 versus 0.318 for static burden, and showed similar performance at Day 0 (7 days before recorded surgery date), with AUROC 0.681 and AUPRC 0.363. Decision-curve and feature analyses suggested that selected temporal transitions were clinically interpretable across renal, inflammatory, hepatobiliary, hematologic, glycemic, and nutritional trajectories.
These findings suggest that STDP-inspired transition features may provide a practical, interpretable way to incorporate temporal ordering into EHR-based risk prediction across both acute and longitudinal settings.
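As a rough illustration of the directional transition features described in this abstract (does event A precede event B within a prespecified window?), the sketch below builds a sparse set of ordered event pairs from a timestamped event list. The function name, the event codes, and the 24-hour window are hypothetical; the authors' actual encoding, including any STDP-style weighting, is not specified in the abstract.

```python
def transition_features(events, window_hours):
    """Build directional transition features from timestamped events.

    events: list of (time_in_hours, event_code) tuples.
    Returns the set of ordered pairs (a, b) such that some occurrence
    of `a` strictly precedes some occurrence of `b` by at most
    `window_hours`. (a, b) and (b, a) are distinct features.
    """
    ordered = sorted(events)  # sort by time (ties broken by code)
    features = set()
    for i, (t_a, a) in enumerate(ordered):
        for t_b, b in ordered[i + 1:]:
            gap = t_b - t_a
            if gap > window_hours:
                break  # later events are even further away
            if gap > 0 and a != b:
                features.add((a, b))
    return features

# Hypothetical ICU event stream (codes are illustrative only).
stay = [
    (0, "creatinine_high"),
    (6, "diuretic_given"),
    (30, "lactate_high"),
]
print(sorted(transition_features(stay, window_hours=24)))
```

Each resulting pair can serve as a binary column in a feature matrix, which keeps the representation sparse and lets each feature be read directly as "A was followed by B within the window".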
Belhadj, N. B.; Mezghich, M. A.; Fattahi, J.; Ghayoula, R.; Latrach, L.
Diabetic retinopathy (DR) is the leading cause of preventable blindness in working-age adults, affecting an estimated 103 million people worldwide. Standard deep learning classifiers treat fundus images as independent samples, ignoring latent inter-patient relational structure that is most informative at clinically ambiguous intermediate severity levels. We propose a topology-aware, graph-based deep learning framework combining three complementary components: (i) an EfficientNet-B3 convolutional backbone for high-level visual feature extraction; (ii) persistent homology descriptors (H0 and H1) derived from morphologically skeletonised retinal vascular networks, characterising global vascular topology in a noise-robust manner; and (iii) a GraphSAGE graph neural network propagating disease-related information across a population-level similarity graph, refining representations through inductive neighbourhood aggregation. The similarity graph combines cosine similarity on visual features with 2-Wasserstein distance between persistence diagrams. Evaluated on three public benchmarks, the framework achieves 95.5% accuracy on Kaggle DR, 96.1% on Messidor-2, and 94.6% on APTOS 2019, consistently outperforming a strong CNN baseline by 1.5-2.3 percentage points across accuracy, Quadratic Weighted Kappa, and macro-F1. Ablation experiments confirm synergistic contributions of topological feature augmentation and relational graph learning. One-way ANOVA (F > 80, p < 0.001) confirms that DR progression is reflected in global vascular topology across all five severity stages, providing quantitative biological grounding for the framework design. Code and data are publicly available at https://github.com/Nader-BelHadj/plosene.
Lee, H.; Kim, H.
Background: CD276 has been proposed as a candidate gene associated with the biological characteristics of meningioma, but its predictive position and interpretive significance within a transcriptomic classifier have not yet been clearly established. Accordingly, this study aimed to evaluate CD276 stepwise across internal model development, external validation, calibration, decision-analytic assessment, feature stability, and robustness analyses using public transcriptomic cohorts. Methods: The analyses in this study were organized into two interconnected notebooks. In Notebook A, we reconstructed the internal training cohort (GSE183653), evaluated the CD276 single-gene signal, and then developed a transcriptome-wide multigene classifier. We also performed permutation importance, bootstrap confidence interval, label permutation test, repeated cross-validation, CD276 ablation, and internal calibration analyses. In Notebook B, we reproduced the external validation cohort (GSE136661) in a fixed common-gene space, applied train-only recalibration and train-only threshold transfer, and extended the interpretation through decision curve analysis, stability analysis, enrichment analysis, and one-factor-at-a-time robustness analysis. Results: The internal training cohort consisted of 185 samples and 58,830 genes, of which 25 were WHO grade III cases. CD276 expression showed a significant association with WHO grade, but the internal discrimination of the CD276-only baseline was limited (ROC-AUC 0.628, average precision 0.323, balanced accuracy 0.540). In contrast, the initial transcriptome-wide model showed ROC-AUC 0.834 and PR-AUC 0.509, and under 5-fold cross-validation, the canonical full-transcriptome model and the CD276-forced 5,001-feature branch showed mean ROC-AUC/PR-AUC of 0.854/0.564 and 0.855/0.606, respectively, outperforming the CD276-only baseline at 0.644/0.391.
CD276 was not included in the initial 5,000-feature filtered set and ranked 900th among 5,001 features even in the forcibly included 5,001-feature branch. In paired ablation analysis, the performance difference attributable to inclusion of CD276 was effectively close to zero (delta ROC-AUC 0.000062, delta PR-AUC 0.000056). Internal calibration analysis showed an overconfident probability pattern (Brier score 0.10501, intercept -1.421392, slope 0.413241). In external validation, the fixed multigene pipeline achieved ROC-AUC 0.928 and PR-AUC 0.335. Train-only recalibration improved calibration metrics while preserving discrimination, and decision curve analysis showed threshold-dependent but limited external utility. Stability analysis showed overlap between core-stable genes and high-impact genes, but CD276 was not supported as a dominant stable core feature and remained in the target-of-interest tier. In robustness analysis, some perturbations preserved the primary interpretation, whereas others revealed transform sensitivity or an alternative high-performing feature-space solution. Conclusions: CD276 is a gene of interest associated with meningioma grade, but it was difficult to interpret it as a strong standalone predictor or a dominant stable classifier feature. In this study, the main basis of predictive performance lay not in CD276 alone but in a broader multigene transcriptomic structure, and probability output needed to be interpreted conservatively with calibration taken into account. These findings position CD276 not as a direct single-gene classifier but as a biology-motivated target-of-interest that should be interpreted within a broader transcriptomic program.
Yoo, J.; Rachim, V. P.; Lee, Y.; Lee, J.; Park, S.-M.
Insulin therapy in type 1 diabetes requires constant dose adjustment based on blood glucose, meals, physiological states, and physical activity. This demanding self-management imposes a substantial burden and increases dosing-error risk, underscoring the need for automated insulin delivery (AID) systems that reduce user intervention. However, many current systems depend on fixed, individualized parameters and may not fully adapt to rapid or unobserved physiological changes. We developed the Dynamic Physiology-Aware Reinforcement learning Controller (DPARC), a zero-shot insulin optimizer that infers latent physiological dynamics from recent continuous glucose monitoring (CGM) and insulin-delivery history without prior personalization, carbohydrate announcements, or preset subject-specific parameters. DPARC uses a rolling 24-hour CGM and insulin-history window, but closed-loop operation can begin after 1 hour of observed data by initializing unobserved history with neutral normalized padding and progressively replacing it with observations. In silico, a single frozen DPARC policy adapted within 1 hour, improved time in range compared with a total daily insulin-conditioned reinforcement learning baseline, and approached the upper-bound performance of a fully personalized model under stochastic unannounced meals with randomized timing, carbohydrate amounts, absorption variability, and meal skipping. In supervised porcine studies under unannounced meals, DPARC maintained high time in range without manual configuration, supporting large-animal feasibility while prospective human evaluation is needed before clinical efficacy can be established. Learned latent representations correlated with physiological markers including insulin sensitivity and plasma insulin concentration, supporting physiological alignment and explanatory anchors. Collectively, these findings support DPARC as a preclinical proof-of-concept zero-shot AID framework for future supervised human evaluation.
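The cold-start behavior described here (a rolling 24-hour window whose unobserved history starts as neutral padding and is progressively replaced by observations) can be pictured with a small sketch. Everything below is an assumption for illustration: the 5-minute sampling interval, the padding value of 0.0, and the helper names are hypothetical and are not taken from DPARC.

```python
from collections import deque

SAMPLES_PER_HOUR = 12                    # assumes one CGM reading every 5 minutes
WINDOW_LENGTH = 24 * SAMPLES_PER_HOUR    # 24-hour rolling window

def make_window(length=WINDOW_LENGTH, pad_value=0.0):
    """Create a fixed-length history buffer filled with neutral padding."""
    return deque([pad_value] * length, maxlen=length)

def observe(window, normalized_reading):
    """Append a new observation; the oldest entry (padding first) drops out."""
    window.append(normalized_reading)

# After one hour of data, the newest 12 slots hold observations
# and the remaining slots still hold padding.
window = make_window()
first_hour = [0.1 * i for i in range(1, SAMPLES_PER_HOUR + 1)]
for reading in first_hour:
    observe(window, reading)

print(len(window), list(window)[-3:])
```

With this layout the controller always receives an input of the same shape, and the padded portion shrinks by one slot per new reading until the window is fully observed after 24 hours.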
Lu, S.; Ruan, X.; Wang, L.; Wang, X.; Sameer, M.; Liu, H.
Although GLP1/GIP receptor agonists demonstrate unprecedented weight loss efficacy, their rapid clinical adoption has revealed significant real-world tolerability challenges. To evaluate their dynamic safety profiles, we developed a macro-to-micro pharmacovigilance framework by combining global FAERS reports with local UT Physician EHR data. Macroscopically, we distilled 17 adverse events shared across the drug class from FAERS using disproportionality analysis. Microscopically, local EHR data (289,655 longitudinal treatment sessions across 71,316 patients) revealed that 51.6% of GLP1 sessions terminated within 90 days. Furthermore, temporally stratified logistic regression demonstrated that initial exposure (0 to 30 days) correlated strongly with nausea and vomiting, which attenuated in extended sessions, whereas extended exposure (>2 years) uncovered late-onset risks, notably incident hepatic steatosis. Ultimately, this time-aware framework reveals that GLP1 safety profiles are profoundly duration-dependent, providing critical insights into both acute intolerances and long-term medication safety.
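The abstract does not say which disproportionality measure was applied to FAERS. One commonly used choice is the reporting odds ratio (ROR) computed from a 2x2 table of report counts, sketched below as an assumption rather than as the authors' method. The counts are invented for illustration.

```python
import math

def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Reporting odds ratio with an approximate 95% confidence interval.

    a: reports with the drug of interest and the event
    b: reports with the drug of interest, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper

# Hypothetical counts (not from FAERS).
ror, lower, upper = reporting_odds_ratio(a=40, b=960, c=200, d=18800)
print(round(ror, 2), round(lower, 2), round(upper, 2))
```

A signal is often flagged when the lower bound of the interval exceeds 1, although the specific criterion varies between studies.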
Maniscalco, D.; Robineau, O.; Boelle, P.-Y.; Mailles, A.; Noel, H.; Tarantola, A.; Velter, A.; Colizza, V.
Background. Despite the decline of the 2022 global outbreak, mpox remains an ongoing public health concern, with persistent transmission and emerging viral clades sustaining resurgence risk. Improving preparedness and response is a priority, yet it remains unclear how pre-exposure vaccination and community response can best limit transmission under realistic conditions and whether behavioral adaptation is critical. Methods. We used a data-driven network model of mpox transmission among men who have sex with men (MSM) in the Paris region, parameterized with sexual behavioral data and calibrated to surveillance data from the 2022 outbreak. We evaluated counterfactual scenarios by varying vaccination timing, rollout speed, prioritization, and behavioral responses. Results. Here we show that, with respect to the 2022 epidemic in the Paris region, vaccination alone delivered at the observed rollout speed would not have reproduced the observed epidemic decline, even if initiated on the day of the first European alert, 12 days before the first case was reported in France. Achieving comparable control through vaccination alone would have required more than a fourfold increase in rollout speed. Large-scale and long-term reductions in sexual contacts remain instrumental to limit the epidemic size, although earlier vaccination reduces the proportion of MSM needing to change behavior. In contrast, short-term behavioral measures adopted by the vaccinees, such as sexual abstinence during the 14-day immunity-building period, combined with moderately faster vaccine rollout (+68% for 50% compliance; +34% for 75% compliance), could achieve comparable epidemic control. Targeting individuals with higher sexual activity further improved intervention efficiency. Conclusions. Under realistic reactive vaccination scenarios, mpox control still requires strong behavioral responses.
Combining timely vaccination with short-term behavioral change guidance at vaccine administration offers a feasible path to limit transmission and strengthen outbreak preparedness and response.
Inoki, Y.; Horinouchi, T.; Sakakibara, N.; Ishiko, S.; Yamamoto, A.; Aoyama, S.; Kimura, Y.; Ichikawa, Y.; Tanaka, Y.; Kondo, A.; Yamamura, T.; Ishimori, S.; Araki, Y.; Asano, T.; Fujimura, J.; Fujinaga, S.; Hamada, R.; Inoue, N.; Kaito, H.; Kiyota, K.; Kobayashi, A.; Kobayashi, Y.; Kumagai, N.; Miyano, H.; Ohtomo, Y.; Sasaki, S.; Suzuki, R.; Washio, M.; Yamada, Y.; Yamasaki, Y.; Yokoyama, T.; Iijima, K.; Nagano, C.; Nozu, K.
Chronic benign proteinuria (PROCHOB), caused by biallelic pathogenic variants in CUBN, presents in childhood as isolated, asymptomatic tubular proteinuria with preserved long-term kidney function. Because its clinical presentation closely mimics early-stage glomerular diseases with moderate proteinuria and without increased urinary β2-microglobulin (uBMG) and α1-microglobulin, numerous patients undergo unnecessary kidney biopsies and receive angiotensin-converting enzyme inhibitors or angiotensin II receptor blockers before genetic testing is considered. Using high-throughput aptamer-based urinary proteomics (SomaScan®), we identified urinary myoglobin as a disease-specific biomarker for PROCHOB. We developed and confirmed a diagnostic approach in which the urinary myoglobin-to-creatinine (uMB/Cr) ratio robustly distinguishes PROCHOB from other moderate glomerular proteinuric kidney diseases. Although certain cases of Dent disease causing megalin dysfunction exhibit increased urinary myoglobin levels, PROCHOB and Dent disease can be clearly distinguished based on the uBMG-to-creatinine ratio. This biomarker reflects impaired proximal tubular protein reabsorption because of cubilin dysfunction and remains normal in healthy individuals or those with typical glomerular diseases with moderate proteinuria. Our findings establish a noninvasive diagnostic tool for PROCHOB that prompts targeted genetic testing for CUBN variants using the uMB/Cr and uBMG-to-creatinine ratios. This strategy has the potential to transform the clinical diagnostic pathway for isolated proteinuria.
Krepel, J.; Binkyte, R.; Kerkouche, R.; Harries, M.; Klett-Tammen, C. J.; Fritz, M.; Kesselheim, S.; Kuehn, M.; Bazarova, A.; Lange, B.
During the COVID-19 pandemic, reported incidence data played a central role in public health surveillance and in tracking epidemic dynamics, although they provide limited insight into the behavioral, immunological, and socioeconomic drivers of transmission. Population-based seroprevalence studies with linked survey data offer a rich but untapped source of individual-level information that can complement routine surveillance. In this study, we investigate whether aggregated seroprevalence cohort data can be leveraged to predict local COVID-19 incidence and to identify interpretable predictors associated with transmission dynamics. Using data from the Multilocal SeroPrevalence (MuSPAD) study in Germany (2020-2022), we trained multiple machine learning models, including least absolute shrinkage and selection operator (LASSO), vector autoregressive models (VAR), multilayer perceptrons (MLPs), and long short-term memory neural networks (LSTMs), to predict location-specific seven-day incidence rates. Feature importance was assessed using regression coefficients where applicable and model-agnostic explainability methods, including Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). Across model classes, cohort-derived features enabled accurate prediction of local incidence, with time-aware models achieving the strongest performance. Consistent predictors included prior infection and testing history, employment-related changes, vaccination status, and mask-wearing behavior, highlighting the importance of behavioral and reporting-related signals. While differential privacy introduced modest degradation in predictive performance under strict privacy budgets, SHAP-based explanations remained stable, and LIME-based explanations were more sensitive to privacy-induced noise. These results demonstrate that aggregated cohort data encode meaningful and interpretable signals of population-level transmission dynamics.
Population-based serosurveys therefore provide a complementary source of information for predicting local COVID-19 incidence and identifying key drivers of transmission beyond routine surveillance data. Our findings show that integrating interpretable machine learning with privacy-aware analysis enables actionable insights from sensitive cohort data, supporting their use in digital epidemiology and informing data-driven public health decision-making.
Pinero, S. L.; Li, X.; Lee, S. H.; Liu, L.; Li, J.; Le, T. D.
Long COVID affects millions of people worldwide, yet no disease-modifying treatment has been approved, and existing interventions have shown only modest and inconsistent benefits. A key reason for this limited progress is that current computational drug repurposing pipelines do not match well with the clinical reality of Long COVID. These patients often have persistent, multi-systemic symptoms and may already be taking multiple medications, making treatment safety a primary concern. However, most repurposing workflows still treat safety as a downstream filter and rely on disease-associated targets rather than causal drivers. They also assume that the findings of one analysis would generalize across the diverse presentations of Long COVID. We introduce SPLIT, a safety-first repurposing framework that addresses these limitations. SPLIT prioritizes safety at the start of the candidate evaluation, integrates complementary causal inference strategies to identify likely driver genes, and uses a counterfactual substitution design to compare drugs within specific cohort contexts. When applied to cognitive and respiratory Long COVID cohorts, SPLIT revealed three main findings. First, drugs with similar predicted efficacy could have very different predicted safety profiles. Second, the drugs flagged as unfavorable were often different between the two cohorts, showing that drug prioritization is phenotype-specific. Third, SPLIT flagged 18 drugs currently under active investigation in Long COVID trials as having unfavorable predicted profiles. SPLIT provides a practical framework to identify safer, more context-appropriate candidates earlier in the process, supporting more targeted and better-tolerated treatment strategies for Long COVID.
Saad, A. A.; Murthi, S. B.; Boctor, E. M.; Teeter, W. A.; Seam, N.
The increasing availability of portable ultrasound systems motivates exploration of novel approaches to respiratory signal assessment. In this in-vitro study, we investigate whether pulsed-wave (PW) Doppler ultrasound can capture structured spectral patterns from replayed lung sound recordings. Digitized respiratory sounds were replayed through a tissue-mimicking ultrasound phantom, generating 1,478 PW Doppler spectral images from recordings associated with healthy subjects and several externally labeled disease categories. Exploratory classification experiments using a ResNet-18 architecture demonstrated that these Doppler representations contain learnable differences under controlled conditions. These findings motivate further investigation into PW Doppler as a potential representation of respiratory acoustics.
Brann, E.; Polle, R.; Cepukaityte, G.; Georgescu, A. L.; Parsons, O.; Molimpakis, E.; Goria, S.
Accessible screening for type 2 diabetes (T2D) is critical, with millions of cases remaining undiagnosed globally. Here, we present the largest known real-world validation study for a speech-based T2D prediction model, trained on speech data from over 21,000 individuals, that works on features extracted from 20-second speech recordings. The model was evaluated in two stages: 1) against self-reported diagnoses in 7,319 English-speaking participants using AUC, and 2) against HbA1c blood tests in a subset of 801 participants drawn from the full cohort. Performance was also compared against QDiabetes and in the presence of key confounding variables. The model demonstrated clinically useful predictive capacity on self-reported data (AUC = 0.80 ± 0.03), approaching QDiabetes (AUC = 0.86 ± 0.03). It was robust to most demographic confounds (e.g., age and sex) and medication use, with reduced performance in the presence of comorbidities (e.g., cardiovascular disease and hypertension). At the diabetes threshold of HbA1c ≥48 mmol/mol, the model achieved an AUC of 0.75 (±0.07). This biomarker-validated speech-based tool demonstrates potential to complement existing methods through accessible, scalable screening requiring only a 20-second speech sample.
Omar, M.; Agbareia, R.; McGreevy, J.; Zebrowski, A.; Ramaswamy, A.; Gorin, M.; Anato, E. M.; Glicksberg, B. S.; Sakhuja, A.; Charney, A.; Klang, E.; Nadkarni, G.
Large language models are increasingly used for clinical guidance while their parent companies introduce advertising. We tested whether pharmaceutical ads embedded in the prompts of 12 models from OpenAI, Anthropic, and Google shift drug recommendations across 258,660 API calls and four experiments probing distinct epistemic conditions. When two drugs were both guideline-appropriate, advertising shifted selection of the advertised drug by +12.7 percentage points (P < 0.001), with some model-scenario pairs shifting from 0% to 100%. Google models were the most susceptible (+29.8 pp), followed by OpenAI (+10.9 pp), while Anthropic models showed minimal change (+2.0 pp). When the advertised product lacked evidence or was clinically suboptimal, models resisted. This reveals a structured vulnerability: advertising does not override medical knowledge but fills the space where clinical evidence is underdetermined. An open-response sub-analysis (2,340 calls across three representative models) confirmed that advertising restructures free-text clinical reasoning: models echoed ad claims at 2.7 times the baseline rate while maintaining high stated confidence and rarely disclosing the ad. Susceptibility was provider-dependent (Google: +29.8 pp; OpenAI: +10.9 pp; Anthropic: +2.0 pp). Because this bias operates within clinically correct answers, it is invisible to accuracy-based evaluation, identifying a class of AI safety vulnerability that standard testing cannot detect.
Schwoebel, J.; Frasch, M.; Spalding, A.; Sewell, E.; Englert, P.; Halpert, B.; Overbay, C.; Semenec, I.; Shor, J.
As health systems begin deploying autonomous AI agents that make independent clinical decisions and take direct actions within care workflows, ensuring patient safety and care quality requires governance standards that go beyond existing medical device frameworks designed for human-in-the-loop prediction tools. This paper introduces the Healthcare AI Agents Regulatory Framework (HAARF), a comprehensive verification standard for autonomous AI systems in clinical environments, developed collaboratively with 40+ international experts spanning regulatory authorities, clinical organizations, and AI security specialists. HAARF synthesizes requirements from nine major regulatory frameworks (FDA, EU AI Act, Health Canada, UK MHRA, NIST AI RMF, WHO GI-AI4H, ISO/IEC 42001, OWASP AISVS, IMDRF GMLP) into eight core verification categories comprising 279 specific requirements across three risk-based implementation levels. The framework addresses critical gaps in health system readiness for autonomous AI including: (1) progressive autonomy governance with clinical accountability, (2) tool-use security for agents that independently access EHRs, medical devices, and clinical systems, (3) continuous equity monitoring and bias mitigation across diverse patient populations, and (4) clinical decision traceability preserving human oversight authority. We validate HAARF's enforcement capabilities through a scenario-based red-team evaluation comprising six adversarial scenarios executed under baseline (no middleware) and HAARF-guardrailed conditions (N = 50 trials each, Gemini 2.5 Flash primary with Claude Sonnet 4.6 cross-model validation). In baseline conditions, the agent model executes unauthorized tools in 56-60% of adversarial trials. Under the HAARF condition, deterministic middleware enforcement reduces the unauthorized-tool success rate to 0%, with 0% contraindication misses and 0% policy-injection success (95% Wilson CI [0.00, 0.07]). Cross-model validation confirms identical security metrics, supporting HAARF's model-agnostic design. Mapping analysis demonstrates 48-88% coverage of major regulatory frameworks, with per-category FDA alignment ranging from 73% (C5, Agent Registration) to 91% (C3, Cybersecurity; C7, Bias & Equity). Initial validation with healthcare organizations shows a 40-60% reduction in multi-jurisdictional compliance burden and improved clinical safety governance outcomes. HAARF provides health systems with a practical, risk-stratified pathway for safe AI agent deployment, shifting from reactive compliance to proactive quality governance while maintaining rigorous patient safety standards and human-centered care principles.
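The 95% Wilson interval quoted in the HAARF abstract for a 0% rate over N = 50 trials can be checked by hand. The sketch below is only an illustration of the standard Wilson score formula, not the authors' code; the function name `wilson_interval` and the fixed z-value of 1.959964 are assumptions introduced here.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (hypothetical helper).

    z = 1.959964 corresponds to a two-sided 95% interval.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")

    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    # Centre of the interval is pulled towards 0.5 relative to p_hat.
    centre = (p_hat + z2 / (2.0 * trials)) / denom
    # Half-width of the interval.
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z2 / (4.0 * trials ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


# Zero events in 50 trials, as reported for the guardrailed condition.
lower, upper = wilson_interval(0, 50)
print(f"[{lower:.2f}, {upper:.2f}]")
```

With zero events the lower bound collapses to 0 and the upper bound reduces to z²/(n + z²), roughly 0.07 for n = 50, which matches the reported interval. An observed 0% rate in 50 trials is therefore still compatible with a true rate of up to about 7%.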
Parlatan, U.; Patel, A. N.; Torun, H.; Karim, A. H.; Ozen, M. O.; Palaniappan, L.; Demirci, U.
Aims: To characterize subtype-associated heterogeneity in type 2 diabetes mellitus (T2DM), particularly normal-weight diabetes, using extracellular vesicle (EV)-associated molecular features in a clinically stratified cohort. Methods: EVs were isolated from plasma using ExoTIC and validated by transmission electron microscopy, nanoparticle tracking analysis, flow cytometry, and Western blotting. EVs from Asian normal-weight (A-NWD), Asian overweight (A-OWD), Non-Hispanic White normal-weight (W-NWD), and Non-Hispanic White overweight (W-OWD) T2DM patients were analyzed by multimodal surface-enhanced Raman spectroscopy (SERS; n=65) and EV-RNA sequencing (n=39). Results: SERS identified subgroup-associated spectral fingerprints that distinguished the four BMI- and race/ethnicity-defined groups in this cohort. EV-RNA sequencing revealed differential microRNA expression across subgroups, with higher miR-208a and miR-132 in A-OWD and higher miR-484 in A-NWD. Unsupervised analyses also showed partially overlapping EV-associated molecular features between A-NWD and W-OWD, suggesting that BMI-based subgrouping alone may not fully capture shared metabolic states. Conclusions: Multimodal EV profiling identified subgroup-associated spectral and miRNA features in clinically stratified T2DM and provides a framework for studying diabetes heterogeneity, including molecular patterns associated with normal-weight diabetes.
Dong, Z.; Kethireddy, S.; Kim, D.; Ting, P.; Lal, B.; Lee, K.; Kim, D.-H.; Ahn, E. H.
Glioblastoma (GBM) lethality arises from aggressive invasion and diffuse infiltration of brain tissue. Conventional GBM preclinical models often fail to predict clinical therapeutic efficacy because they do not recapitulate the pathological extracellular matrix (ECM) cues that drive tumor invasion. Here, we present an ECM mimetic 3D platform using a fibrin scaffold to recapitulate the hemorrhagic, pro-thrombotic tumor microenvironment characteristic of high-grade gliomas. This fibrin scaffold induces a pro-invasive phenotype in GBM spheroids by upregulating proliferation/cell cycle- (MYC, FOXOM1, CCND1) and invasion-associated (CTSS, FOXM1, CCND1) genes. Traditional cell morphology quantification methods (e.g., circularity) distil complex shapes into singular metrics and cannot capture the nuances of invasion. To address this limitation, we have applied a deep-learning segmentation pipeline (MARS-Net) and high-content morphodynamic descriptors. By using the Preserving Heterogeneity (PHet) algorithm, the 3D platform accurately classifies invasiveness levels and captures the invasion-inhibitory effects of potential repurposable drug candidates. We demonstrate that our model can predict a spheroid's long-term invasive fate with high accuracy using only partial image sets from early time-points, rather than the complete time-course images. Our work presents an in vivo-like, scalable 3D platform integrated with a quantitative high-throughput pipeline to elucidate GBM invasion mechanisms and to evaluate anti-invasive compounds.
Thong, P. M.; Hu, T. H.; Ooi, J. S. G.; Loh, F. K.; Lee, H.; Bai, C.; Chong, H. T.; Chang, A. J. W.; Choong, C. V.; Galamay, L.; Beh, D. L. L.; Ang, A. X. Y.; Lum, L. H. W.; Yang, S. P.; Lim, A. Y. L.; Mok, S. F.; Vallejo, A. F.; Kao, S. L.; Chan, K. R.; Ong, C. W. M.
Background: Diabetes mellitus (DM) worsens pulmonary tuberculosis (TB) and drives systemic hyper-inflammation, but the underlying mechanisms remain unknown. Neutrophils have key roles in TB immunopathology and lung cavitation. Here, we determine the role of neutrophils in DMTB patients and in driving TB immunopathology. Methods: Sputum and plasma from 30 TB and 30 DMTB patients were analysed for proteases and cytokines using Luminex bead array. Whole blood transcriptomics identified transcriptional differences. Single-cell RNA sequencing characterised neutrophil subsets and dysregulated pathways. Neutrophil function in poorly-controlled DM patients (HbA1c>8%) and healthy controls (HC) was examined following Mycobacterium tuberculosis stimulation, including reactive oxygen species (ROS), neutrophil extracellular traps (NETs), and phagocytosis. Pathways were interrogated using chemical inhibitors, protein array and western blot. Results: Compared to non-diabetic TB patients, poorly-controlled DMTB patients showed up-regulated sputum MMP-8 and MMP-9, associated with increased collagen-destruction and lung cavity formation. Circulating neutrophil count and neutrophil-derived plasma MMP-8 were up-regulated, alongside transcriptional enrichment of extracellular matrix degradation and inflammatory pathways including TNF and RAGE. Single-cell profiling identified reduced cycling neutrophil subset and myelocytes in DMTB, with overall reduced antibacterial and cell-killing signatures. Ex vivo mycobacterial stimulation of DM neutrophils increased ROS and MMP-9 with impaired NETs and delayed phagocytosis. TNFR1, TNFR2, and RAGE were up-regulated. RAGE inhibition with rosiglitazone mitigated Mtb-induced ROS and MMP-8 release. Conclusion: DM worsens neutrophil-driven tissue destruction and inflammation in TB via dysregulated TNF and RAGE-signalling, priming neutrophils towards immunopathology. Targeting RAGE alongside tight glycaemic control may dampen neutrophil hyper-inflammatory responses to limit tissue destruction.